Abstract

We have investigated the space syntax of Venice by means of random
walks. Random walks defined on an undirected graph establish a
Euclidean space in which distances and angles between nodes acquire
a clear statistical interpretation. The properties of nodes with respect
to random walks allow partitioning the city canal network into disjoint
divisions which may be identified with the traditional divisions of the
city (sestieri).

1 The Sestieri of Venice

Spectral methods can be used to visualize graphs of not very large
multi-component networks [1]. City districts constructed according to
different development principles in different historical epochs can be
envisioned on the dual graph representation of space syntax.

We investigate the segmentation of the spatial network of 96
canals in Venice (which stretches across 122 small islands between
which the canals serve the function of roads) in accordance with its
historical divisions. The sestieri are the primary traditional
divisions of Venice (see Fig. 1): Cannaregio, San Polo, Dorsoduro, Santa
Croce, San Marco and Castello, Giudecca. The oldest settlements
in Venice appeared from the 6th century in Dorsoduro, along
the Giudecca Canal. By the 11th century, settlement had spread
across to the Grand Canal. The Giudecca island is composed of 8
islets separated by canals dredged in the 9th century when the area
was divided among the rebelling nobles. San Polo is the smallest of
the six sestieri of Venice, covering just 35 hectares along the Grand
Canal. It is one of the oldest parts of the city, having been settled
before the 9th century, when it and San Marco (lying in the heart
of the city) formed part of the Realtine Islands. Cannaregio, named
after the Cannaregio Canal, is the second largest district of the city.
It was developed from the 11th century. Santa Croce occupies the
north west part of the main islands, lying on land only created from
the late Middle Ages to the twentieth century. The district Castello
grew up from the 13th century.

Figure 1: The sestieri are the primary traditional divisions of Venice. The image has
been taken from 'Portale di Venezia' at http://www.guestinvenice.com/.

In the present paper, we address the following question: Given a 
spatial network of a city, is it possible to uncover its historical and 
functional divisions directly from its space syntax? 

In Sec. 2 we discuss the primary and dual graph representations
of urban environments. The dual graph representation has been
extensively studied in space syntax theory, which is instrumental in
predicting human behavior in urban environments. In Sec. 3 we
demonstrate that space syntax is related to the traffic equilibrium
state of a transport network, and that Markov's transition operators
naturally appear in the space syntax context, embedding city space
syntax into Euclidean space. We build the dual graph representation
of Venetian canals in Secs. 4 and 5 and then perform the Principal
Component Analysis of Venetian space syntax in Sec. 6. The properties
of nodes with respect to random walks allow partitioning the city
canal network into disjoint divisions which may be identified with
the traditional divisions of the city (sestieri).

2 Graphs and Space Syntax of Urban Environments

Urban space is of rather large scale to be seen from a single view- 
point; maps provide us with its representations by means of abstract 
symbols facilitating our perceiving and understanding of a city. The 
middle scale and small scale maps are usually based on Euclidean 
geometry providing spatial objects with precise coordinates along 
their edges and outlines. 

There is a long tradition of research articulating urban environment
form using graph-theoretic principles originating from the paper of
Leonhard Euler (see [2]). Graphs have long been regarded as
the basic structures for representing forms where topological
relations are firmly embedded within Euclidean space. The widespread
use of graph theoretic analysis in geographic science was reviewed
in [3], establishing it as central to spatial analysis of urban
environments. In [3], the basic graph theory methods were
applied to the measurements of transportation networks.

Network analysis has long been a basic function of geographic
information systems (GIS) for a variety of applications, in which
computational modelling of an urban network is based on a graph
view in which the intersections of linear features are regarded as
nodes, and connections between pairs of nodes are represented as
edges [5]. Similarly, urban forms are usually represented as
patterns of identifiable urban elements such as locations or areas
(forming nodes in a graph) whose relationships to one another are often
associated with linear transport routes such as streets within cities
[6]. Such planar graph representations define locations or points in
the Euclidean plane as nodes or vertices $\{i\}$, $i = 1, \ldots, N$, and
the edges linking them together as $i \sim j$, in which
$i, j = 1, 2, \ldots, N$. The value of a link can either be binary, with
the value 1 if $i \sim j$ and 0 otherwise, or be equal to the actual
physical distance between nodes, $\mathrm{dist}(i, j)$, or to some weight
$w_{ij} > 0$ quantifying a certain characteristic property of the link.
We shall call a planar graph representing the Euclidean space embedding
of an urban network its primary graph. Once a spatial system has been
identified and represented by a graph in this way, it can be subjected
to graph theoretic analysis.

A spatial network of a city is a network of the spatial elements
of urban environments. They are derived from maps of open spaces
(streets, places, and roundabouts). Open spaces may be broken
down into components; most simply, these might be street segments,
which can be linked into a network via their intersections and
analyzed as a network of movement choices. The study of spatial
configuration is instrumental in predicting human behavior, for
instance, pedestrian movements in urban environments [8]. A set of
theories and techniques for the analysis of spatial configurations is
called space syntax [9]. Space syntax is established on a quite
sophisticated speculation that the evolution of built form can be
explained in analogy to the way biological forms unravel [7]. It has
been developed as a method for analyzing space in an urban environment,
capturing its quality as being comprehensible and easily navigable
[8]. Although, in its initial form, space syntax was focused mainly
on patterns of pedestrian movement in cities, later the various space
syntax measures of urban configuration were found to be correlated
with different aspects of social life [10].




Decomposition of a space map into a complete set of intersecting 
axial lines, the fewest and longest lines of sight that pass through 
every open space comprising any system, produces an axial map 
or an overlapping convex map respectively. Axial lines and convex 
spaces may be treated as the spatial elements (nodes of a morpholog- 
ical graph), while either the junctions of axial lines or the overlaps 
of convex spaces may be considered as the edges linking spatial el- 
ements into a single graph unveiling the topological relationships 
between all open elements of the urban space. In what follows, we
shall call this morphological representation of an urban network a
dual graph.

The transition to a dual graph is a topologically non-trivial
transformation of a planar primary graph into a non-planar one which
encapsulates the hierarchy and structure of the urban area and also
corresponds to the perception of space that people experience when
travelling along routes through the environment.

In Fig. 2, we present a glossary establishing a correspondence
between several typical elements of urban environments
and certain subgraphs of dual graphs. The dual transformation
replaces the 1D open segments (streets) by zero-dimensional
nodes, Fig. 2(1).




Figure 2: The dual transformation glossary between the typical elements of 
urban environments and the certain subgraphs of dual graphs. 

The sprawl-like developments consisting of a number of blind
passes branching off a main route are changed to star subgraphs
having a hub and a number of client nodes, Fig. 2(2). Junctions
and crossroads are replaced with edges connecting the corresponding
nodes of the dual graph, Fig. 2(3). Places and roundabouts are
considered as independent topological objects and acquire
individual IDs, being nodes in the dual graph, Fig. 2(4). Cycles are
converted into cycles of the same lengths, Fig. 2(5). A regular grid
pattern is shown in Fig. 2(6). Its dual graph representation is called
a complete bipartite graph, where the set of vertices can be divided
into two disjoint subsets such that no edge has both end-points in
the same subset, and every line joining the two subsets is present
[11]. These sets can be naturally interpreted as those of the vertical
and horizontal edges in the primary graphs (streets and avenues).
In bipartite graphs, all closed paths are of even length [12].
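The grid case of Fig. 2(6) can be illustrated by a minimal sketch. It assumes a hypothetical grid in which every one of `n_streets` parallel streets crosses every one of `n_avenues` parallel avenues; the function and variable names are illustrative and do not come from the text.

```python
import numpy as np

def grid_dual_adjacency(n_streets, n_avenues):
    """Dual graph of a regular grid: one node per street or avenue,
    one edge per crossing (every street meets every avenue)."""
    n = n_streets + n_avenues
    A = np.zeros((n, n), dtype=int)
    A[:n_streets, n_streets:] = 1   # street i crosses avenue j
    A[n_streets:, :n_streets] = 1   # symmetric counterpart
    return A

A = grid_dual_adjacency(3, 4)

# Complete bipartite structure: no edge inside either subset,
# every edge between the two subsets is present.
no_intra_edges = not A[:3, :3].any() and not A[3:, 3:].any()
all_inter_edges = bool(A[:3, 3:].all())

# trace(A^3) counts closed walks of length 3; it vanishes here,
# in line with the absence of odd closed paths in bipartite graphs.
closed_3_walks = int(np.trace(np.linalg.matrix_power(A, 3)))
```

In this sketch all streets share the same neighbors (the avenues), so they are indistinguishable from the point of view of the dual graph.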

It is the dual graph transformation which allows us to separate the
effects of order and of structure while analyzing a transport network
on morphological grounds. It converts the repeating geometrical
elements expressing the order in urban developments into
twin nodes, pairs of nodes such that any other node is adjacent either
to both of them or to neither of them. Examples of twin nodes can
be found in Fig. 2(2,4,5,6).

3 Traffic Equilibrium, Space Syntax, and Random Walks

The concept of equilibrium, the condition of a system in which
all competing influences are balanced, is a key theoretical element
in any branch of science. The notion of traffic equilibrium was
introduced by J.G. Wardrop [13] and then generalized in [14] to
a fundamental concept of network equilibrium with many potential
applications, such as establishing rigorous mathematical foundations
for the analysis of congested transport networks. Wardrop's
traffic equilibrium [13] is strongly tied to city space syntax since it
requires that, while attaining the equilibrium, all travellers have
enough knowledge of the transport network they use. Because of the
complexity of the traffic situation in the network, the route choice
decisions taken by travellers are not always objectively optimal.
However, there is another link between traffic equilibrium and space
syntax which has never been discussed in the literature.

Given a connected undirected graph $G(V, E)$, in which $V$ is the
set of nodes and $E$ is the set of edges, we can define the traffic
volume $f : E \to (0, \infty)$ through every edge $e \in E$. It then follows
from the Perron-Frobenius theorem that the linear equation

$$ f(e) = \sum_{e' \in E} f(e') \exp\left(-h\,\ell(e')\right) \tag{1} $$

has a unique positive solution $f(e) > 0$, for every edge $e \in E$, for
a fixed positive constant $h > 0$ and a chosen set of positive metric
length distances $\ell(e) > 0$. This solution is naturally identified with
the traffic equilibrium state of the transport network defined on $G$,
in which the permeability of edges depends upon their lengths. The
parameter $h$ is called the volume entropy of the graph $G$, while the
volume of $G$ is defined as the sum

$$ \mathrm{Vol}(G) = \frac{1}{2} \sum_{e \in E} \ell(e). \tag{2} $$

The degree of a node $v \in V$ is the number of its neighbors in $G$,
$\deg(v) = k_v$. It has been shown in [15] that among all undirected
connected graphs of normalized volume, $\mathrm{Vol}(G) = 1$, which are not
cycles and have $k_v \neq 1$ for all nodes, the minimal possible value of
the volume entropy,

$$ \min(h) = \frac{1}{2} \sum_{v \in V} k_v \log(k_v - 1), \tag{3} $$

is attained for the length distances

$$ \ell(e) = \frac{\log\left((k_{i(e)} - 1)(k_{t(e)} - 1)\right)}{2 \min(h)}, \tag{4} $$



where $i(e) \in V$ and $t(e) \in V$ are the initial and terminal vertices of
the edge $e \in E$ respectively. It is then obvious that (4) and $\min(h)$,
being substituted into (1), change the operator $\exp(-h\,\ell(e))$ to a
symmetric Markov transition operator,

$$ f(e) = \sum_{e' \in E} \frac{f(e')}{\sqrt{(k_{i(e')} - 1)(k_{t(e')} - 1)}}, \tag{5} $$

which describes time reversible random walks over edges rather than
over nodes. The flows satisfying (1) with the Markov operator
meet the mass conservation property for some node constants
$\pi_i > 0$. Other solutions $f(e) > 0$ obtained
for $h > \min(h)$ describe equilibrium flows with termination of
travellers. Eq. (5) unveils the indispensable role that Markov chains
defined on edges play in equilibrium traffic modelling and exposes
the degrees of nodes as a key determinant of the properties of
transport networks.
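As a small numerical illustration of Eqs. (2)-(4), the sketch below evaluates the minimal volume entropy and the optimal edge lengths for a hypothetical wheel graph (a hub joined to a 4-cycle). It assumes that the sum in Eq. (2) runs over both orientations of every edge, which is an assumption of this example rather than something stated in the text; under that assumption the volume comes out normalized to 1.

```python
import numpy as np

# Hypothetical example: hub 0 joined to the 4-cycle 1-2-3-4,
# so that every node has degree >= 3.
edges = [(0, 1), (0, 2), (0, 3), (0, 4), (1, 2), (2, 3), (3, 4), (4, 1)]
N = 5
k = np.zeros(N)
for i, j in edges:
    k[i] += 1
    k[j] += 1

# Eq. (3): minimal volume entropy.
h_min = 0.5 * np.sum(k * np.log(k - 1))

# Eq. (4): optimal length of the edge with end nodes i and j.
def length(i, j):
    return np.log((k[i] - 1) * (k[j] - 1)) / (2 * h_min)

# Eq. (2), assuming both orientations of each edge are counted.
oriented = edges + [(j, i) for i, j in edges]
vol = 0.5 * sum(length(i, j) for i, j in oriented)

# exp(-h l(e)) evaluated at h = min(h) gives the kernel of Eq. (5),
# 1 / sqrt((k_i - 1)(k_t - 1)).
weights = {e: np.exp(-h_min * length(*e)) for e in oriented}
```

The kernel depends only on the degrees of the end nodes, which is the sense in which the degrees determine the equilibrium flows.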

Random walks embed connected undirected graphs into Euclidean
space, in which distances and angles acquire a clear statistical
interpretation.

Any graph representation naturally arises as an outcome of
categorization, when we abstract a real world system by eliminating
all but one of its features and by grouping things (or places)
sharing a common attribute into classes or categories. For instance,
the common attribute of all open spaces in city space syntax is that
we can move through them. All elements called nodes that fall into
one and the same group $V$ are considered essentially identical;
permutations of them within the group are of no consequence. The
symmetric group $S_N$ consisting of all permutations of $N$ elements
($N$ is the cardinality of the set $V$) constitutes the symmetry group
of $V$. If we denote by $E \subseteq V \times V$ the set of ordered pairs of
nodes called edges, then a graph is a map
$G(V, E) : E \to K \subseteq \mathbb{R}_+$ (we
suppose that the graph has no multiple edges).

The nodes of $G(V, E)$ may be weighted with respect to some measure
$m = \sum_{i \in V} m_i \delta_i$, specified by a set of positive numbers
$m_i > 0$. The space $\ell^2(m)$ of square-summable functions with respect
to the measure $m$ is the Hilbert space $\mathcal{H}$ (a complete inner
product space). Among all linear operators defined on $\mathcal{H}$, those
invariant under the permutations of nodes are of essential interest
since they reflect the symmetry of the graph. Although there are
infinitely many such operators, only those which maintain conservation
of a quantity may describe a physical process. The Markov transition
operators, which share the property of probability conservation
considered in the theory of random walks on graphs, are among them.
Laplace operators describing diffusions on graphs meet the mean value
property (mass conservation); they give another example [12] studied
in spectral graph theory.

Markov's operators on Hilbert space form the natural language
of complex networks theory. Being defined on a connected undirected
graph, a Markov transition operator $T$ has a unique equilibrium
state $\pi$ (a stationary distribution of random walks) such that

$$ \pi T = \pi, \qquad \pi = \lim_{t \to \infty} \sigma T^{t}, \tag{7} $$

for any density $\sigma \in \mathcal{H}$ ($\sigma_i \geq 0$,
$\sum_{i \in V} \sigma_i = 1$). There is a unique
measure $m_\pi = \sum_{i \in V} \pi_i \delta_i$ related to the stationary
distribution $\pi$ with respect to which the Markov operator $T$ is
self-adjoint,

$$ \hat{T} = \frac{1}{2} \left( \pi^{1/2}\, T\, \pi^{-1/2} + \pi^{-1/2}\, T^{\top} \pi^{1/2} \right), \tag{8} $$

where $T^{\top}$ is the adjoint operator. The orthonormal ordered set of
real eigenvectors $\psi_i$, $i = 1, \ldots, N$, of the symmetric operator
$\hat{T}$ establishes the basis in $\mathcal{H}$. In quantitative theory of
random walks defined on graphs [16], [17] and in spectral graph theory
[19], the properties of graphs are studied in relationship to the
eigenvalues and eigenvectors of self-adjoint operators defined on them.
In particular, the symmetric transition operator defined on undirected
graphs is

$$ \hat{T}_{ij} = \begin{cases} \dfrac{1}{\sqrt{k_i k_j}}, & i \sim j, \\ 0, & \text{otherwise}. \end{cases} \tag{9} $$

Its first eigenvector $\psi_1$, belonging to the largest eigenvalue
$\mu_1 = 1$,

$$ \psi_1 \hat{T} = \psi_1, \qquad \psi_{1,i}^2 = \pi_i, \tag{10} $$

describes the local property of nodes (connectivity), $\pi_i = k_i / 2M$,
where $2M = \sum_{i \in V} k_i$, while the remaining eigenvectors
$\{\psi_s\}_{s=2}^{N}$, belonging to the eigenvalues
$1 > \mu_2 \geq \ldots \geq \mu_N \geq -1$, delineate the
global connectedness of the graph.
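Eqs. (9) and (10) can be checked numerically. The following is a minimal sketch on a hypothetical 5-node graph (a triangle with a two-edge tail); the helper name `symmetric_transition` is illustrative only.

```python
import numpy as np

def symmetric_transition(A):
    """Eq. (9): T_hat[i, j] = 1 / sqrt(k_i k_j) for adjacent i and j."""
    k = A.sum(axis=1)
    return A / np.sqrt(np.outer(k, k)), k

# Hypothetical example: triangle 0-1-2 with a tail 2-3-4.
A = np.array([[0, 1, 1, 0, 0],
              [1, 0, 1, 0, 0],
              [1, 1, 0, 1, 0],
              [0, 0, 1, 0, 1],
              [0, 0, 0, 1, 0]], dtype=float)
T_hat, k = symmetric_transition(A)

# eigh returns eigenvalues in ascending order; reverse so that
# mu_1 = 1 comes first.
mu, psi = np.linalg.eigh(T_hat)
mu, psi = mu[::-1], psi[:, ::-1]

pi = k / k.sum()                # stationary distribution k_i / 2M
psi1_squared = psi[:, 0] ** 2   # Eq. (10): squared entries of psi_1
```

The sign of each eigenvector returned by a numerical solver is arbitrary, which is why the comparison is made on the squared components.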

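Markov's symmetric transition operator $\hat{T}$ defines a projection of
any density $\sigma \in \mathcal{H}$ on the eigenvector $\psi_1$ of the
stationary distribution $\pi$,

$$ \sigma \hat{T} = \psi_1 + \sigma^{\perp} \hat{T}, \qquad \sigma^{\perp} = \sigma - \psi_1. \tag{11} $$

Thus, it is clear that any two densities $\sigma, \rho \in \mathcal{H}$ differ
with respect to random walks only by their dynamical components,

$$ (\sigma - \rho)\, \hat{T}^{t} = (\sigma^{\perp} - \rho^{\perp})\, \hat{T}^{t}, $$

for all $t > 0$. Therefore, we can define a distance between any two
densities which they acquire with respect to random walks by

$$ \| \sigma - \rho \|_T^2 = \sum_{t \geq 0} \langle \sigma - \rho \,|\, \hat{T}^{t} \,|\, \sigma - \rho \rangle, \tag{12} $$

or, in the spectral form,

$$ \| \sigma - \rho \|_T^2 = \sum_{t \geq 0} \sum_{s=2}^{N} \mu_s^{t}\, \langle \sigma - \rho \,|\, \psi_s \rangle \langle \psi_s \,|\, \sigma - \rho \rangle = \sum_{s=2}^{N} \frac{\langle \sigma - \rho \,|\, \psi_s \rangle \langle \psi_s \,|\, \sigma - \rho \rangle}{1 - \mu_s}, \tag{13} $$

where we have used Dirac's bra-ket notation, especially convenient
in working with inner products and rank-one operators in Hilbert
space.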

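If we introduce the new inner product in $\mathcal{H}(V)$ by

$$ (\sigma, \rho)_T = \sum_{t \geq 0} \sum_{s=2}^{N} \mu_s^{t}\, \langle \sigma \,|\, \psi_s \rangle \langle \psi_s \,|\, \rho \rangle \tag{14} $$

for all $\sigma, \rho \in \mathcal{H}(V)$, then (13) can be written as

$$ \| \sigma - \rho \|_T^2 = \| \sigma \|_T^2 + \| \rho \|_T^2 - 2\, (\sigma, \rho)_T, \tag{15} $$

in which

$$ \| \sigma \|_T^2 = \sum_{s=2}^{N} \frac{\langle \sigma \,|\, \psi_s \rangle \langle \psi_s \,|\, \sigma \rangle}{1 - \mu_s} \tag{16} $$

is the squared norm of $\sigma \in \mathcal{H}(V)$ with respect to random walks.
We complete the description of the $(N-1)$-dimensional Euclidean
space structure associated to random walks by mentioning that, given
two densities $\sigma, \rho \in \mathcal{H}(V)$, the angle between them can be
introduced in the standard way,

$$ \cos \angle (\rho, \sigma) = \frac{(\sigma, \rho)_T}{\| \sigma \|_T\, \| \rho \|_T}. \tag{17} $$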

Random walks embed connected undirected graphs into Euclidean
space that can be used in order to compare nodes and to retrace the
optimal coarse-graining representations. Namely, let us consider the
density $\delta_i$ which equals 1 at the node $i \in V$ and zero for all
other nodes. It takes the form $v_i = \pi_i^{-1/2} \delta_i$ with respect to
the measure $m_\pi$. Then, the squared norm of $v_i$ is given by

$$ \| v_i \|_T^2 = \frac{1}{\pi_i} \sum_{s=2}^{N} \frac{\psi_{s,i}^2}{1 - \mu_s}, \tag{18} $$

where $\psi_{s,i}$ is the $i$-th component of the eigenvector $\psi_s$. In
quantitative theory of random walks [18], the quantity (18) is known as
the access time to a target node, quantifying the expected number of
steps required for a random walker to reach the node $i \in V$ starting
from an arbitrary node chosen randomly among all other nodes with
respect to the stationary distribution $\pi$.

The notion of spatial segregation acquires a statistical
interpretation with respect to random walks defined on the graph. In
urban spatial networks encoded by their dual graphs, the access times
$\| v_i \|_T^2$ vary strongly from one open space to another: the norm of
a street that can be easily reached (just in a few random syntactic
steps) from any other street in the city is minimal, while it could be
very large for a statistically segregated street.
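The access times of Eq. (18) are easy to evaluate for small graphs. The sketch below is a minimal illustration on two hypothetical examples, a complete graph and a star; the function name `access_times` is not from the text.

```python
import numpy as np

def access_times(A):
    """Eq. (18): squared norm of every node with respect to random walks."""
    k = A.sum(axis=1)
    pi = k / k.sum()
    mu, psi = np.linalg.eigh(A / np.sqrt(np.outer(k, k)))
    mu, psi = mu[::-1], psi[:, ::-1]          # mu_1 = 1 comes first
    # Sum over s = 2..N of psi_{s,i}^2 / (1 - mu_s), divided by pi_i.
    return (psi[:, 1:] ** 2 / (1.0 - mu[1:])).sum(axis=1) / pi

# Complete graph on 4 nodes: all nodes are equally accessible.
K4 = np.ones((4, 4)) - np.eye(4)
norms_K4 = access_times(K4)

# Star graph: hub 0 with three leaves.
star = np.zeros((4, 4))
star[0, 1:] = star[1:, 0] = 1
norms_star = access_times(star)
```

In the star, the hub has a much smaller norm than the leaves, which matches the interpretation of the norm as a measure of segregation: a leaf can only be reached through the hub.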

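The Euclidean distance between any two nodes of the graph $G$
established by random walks,

$$ K_{i,j} = \| v_i - v_j \|_T^2 = \sum_{s=2}^{N} \frac{1}{1 - \mu_s} \left( \frac{\psi_{s,i}}{\sqrt{\pi_i}} - \frac{\psi_{s,j}}{\sqrt{\pi_j}} \right)^2, \tag{19} $$

is known as the commute time in quantitative theory of random walks
and equals the expected number of steps required for a random
walker starting at $i \in V$ to visit $j \in V$ and then to return to $i$
again [18].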
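A minimal sketch of the commute time computation follows, using the spectral form of the squared distance between the vectors $v_i$ and $v_j$ as in (13) and (18). The example graph (a path of three nodes) and the function name are hypothetical.

```python
import numpy as np

def commute_times(A):
    """Matrix of squared random-walk distances K[i, j] between nodes."""
    k = A.sum(axis=1)
    pi = k / k.sum()
    mu, psi = np.linalg.eigh(A / np.sqrt(np.outer(k, k)))
    mu, psi = mu[::-1], psi[:, ::-1]          # mu_1 = 1 comes first
    # Row i holds psi_{s,i} / sqrt(pi_i (1 - mu_s)) for s = 2..N.
    coords = psi[:, 1:] / np.sqrt(np.outer(pi, 1.0 - mu[1:]))
    diff = coords[:, None, :] - coords[None, :, :]
    return (diff ** 2).sum(axis=2)

# Path 0 - 1 - 2: a walker needs 4 steps on average to get from one
# end to the other, hence 8 steps for the round trip.
path = np.array([[0, 1, 0],
                 [1, 0, 1],
                 [0, 1, 0]], dtype=float)
K = commute_times(path)
```

For the path, a direct first-step calculation gives expected hitting times of 1 step from an end node to the middle and 3 steps back, so the round trip between neighbors takes 4 steps, which the spectral formula reproduces.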

It is important to mention that the cosine of an angle calculated
in accordance with (17) has the structure of Pearson's coefficient of
linear correlation, which reveals its natural statistical interpretation.
Correlation properties of flows of random walkers passing by different
paths have remained beyond the scope of previous studies
devoted to complex networks and random walks on graphs. The
notion of angle between any two nodes in the graph arises naturally
as soon as we become interested in the strength and direction of a
linear relationship between two random variables, the flows of random
walks moving through them. If the cosine of an angle (17) is 1
(zero angle), there is an increasing linear relationship between the
flows of random walks through both nodes. Otherwise, if it is close
to -1 ($\pi$ angle), there is a decreasing linear relationship. The
correlation is 0 ($\pi/2$ angle) if the variables are linearly independent.
It is important to mention that, as usual, the correlation between nodes
does not necessarily imply a direct causal relationship (an immediate
connection) between them.
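The cosines between nodes can be obtained from the same spectral coordinates. The sketch below is a minimal illustration under the assumption that each node $i$ is represented by the vector with components $\psi_{s,i} / \sqrt{\pi_i (1 - \mu_s)}$, $s = 2, \ldots, N$, so that ordinary dot products reproduce the inner product with respect to random walks; the function name and the example graph are hypothetical.

```python
import numpy as np

def random_walk_cosines(A):
    """Cosines of the angles between all pairs of nodes."""
    k = A.sum(axis=1)
    pi = k / k.sum()
    mu, psi = np.linalg.eigh(A / np.sqrt(np.outer(k, k)))
    mu, psi = mu[::-1], psi[:, ::-1]          # mu_1 = 1 comes first
    coords = psi[:, 1:] / np.sqrt(np.outer(pi, 1.0 - mu[1:]))
    gram = coords @ coords.T                  # inner products of node vectors
    norms = np.sqrt(np.diag(gram))            # norms of node vectors
    return gram / np.outer(norms, norms)

# Path 0 - 1 - 2.
path = np.array([[0, 1, 0],
                 [1, 0, 1],
                 [0, 1, 0]], dtype=float)
C = random_walk_cosines(path)
```

The result is a symmetric matrix with unit diagonal whose off-diagonal entries lie between -1 and 1, as expected of correlation coefficients.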

4 Dual Graph of Venetian Canals 

While analyzing the canal network of Venice, we have assigned
an identification number to each of the 96 city canals. Then the dual
graph representation of the canal network is constructed by mapping
canals encoded by the same ID into nodes of the dual graph
and intersections between each pair of canals into edges connecting
the corresponding nodes. The problem of segmentation is closely
related to the problem of three-dimensional (3D) visual
representations.

In order to obtain the 3D visual representation of the dual graph
for the canal network of Venice, we use the spectral properties of
the symmetric transition operator (9).

Figure 3: The segmentation of the dual graph of Venetian canals using three
eigenvectors $[\psi_2, \psi_3, \psi_4]$. The nodes of the dual graph can be partitioned into
classes which can be almost precisely identified with the historical divisions of Venice.

The $(x_i, y_i, z_i)$ coordinates of the $i$-th node of the dual graph in 3D
space are given by the relevant $i$-th components of three eigenvectors
taken from the ordered set $\{\psi_k\}$, $k = 2, \ldots, N$. Possible
segmentations and symmetries of dual graphs can be discovered visually
by using different triples of eigenvectors if the number of nodes in
the graph is not very large. In Fig. 3 we present the results of
segmentation for the dual graph of Venetian canals using
the eigenvectors $[\psi_2, \psi_3, \psi_4]$ belonging to the primary eigenvalues
$1 > \mu_2 > \mu_3 > \mu_4$. Nodes of the dual graph belonging to one and
the same city district developed in a certain historical epoch are
located on one and the same quasi-surface in the Euclidean space
established by random walks.
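The construction of the 3D coordinates can be sketched as follows. The example is a hypothetical toy network (two 4-cliques joined by a single edge), not the Venetian data, and the function name is illustrative.

```python
import numpy as np

def spectral_coordinates_3d(A):
    """(x_i, y_i, z_i) taken from the eigenvectors psi_2, psi_3, psi_4."""
    k = A.sum(axis=1)
    mu, psi = np.linalg.eigh(A / np.sqrt(np.outer(k, k)))
    order = np.argsort(mu)[::-1]              # mu_1 = 1 comes first
    return psi[:, order[1:4]], mu[order[1:4]]

# Hypothetical toy network: two 4-cliques joined by the edge 3-4.
A = np.zeros((8, 8))
A[:4, :4] = 1
A[4:, 4:] = 1
np.fill_diagonal(A, 0)
A[3, 4] = A[4, 3] = 1

xyz, mu_234 = spectral_coordinates_3d(A)

# In this toy case the sign of the x coordinate (psi_2) already
# separates the two cliques.
side = np.sign(xyz[:, 0])
```

When an eigenvalue is degenerate, as happens for the fourth one in this toy network, the corresponding coordinate depends on the basis returned by the solver, so only the leading non-degenerate coordinates should be interpreted.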

Primary eigenvectors of Markov's transition operator defined on
the dual graph representation of a network indicate the directions
in which the equilibrium flows have maximal "extensions". The use
of these eigenvectors as a basis helps us to divide the nodes of the
dual graph into classes which can be almost precisely identified with
the historical city districts. Let us note that the use of
other eigenvectors as the basis for the 3D representations of the dual
graph worsens the quality of segmentation in the sense that it turns
out to be incompatible with the traditional sestieri of Venice. The
slowest modes of the diffusion process described by the primary
eigenvectors allow detecting city modules of different accessibility.

Due to the proper normalization, the components of eigenvectors 
play the role of the Participation Ratios (PR) which quantify the 
effective numbers of nodes participating in a given eigenvector with 
a significant weight. This characteristic has been used in [20] and 
by other authors to describe the modularity of complex networks. 
However, PR is not a well defined quantity in the case of eigenvalue 
multiplicity since the different vectors in the eigenspace correspond- 
ing to the degenerate mode would obviously have different PR. 

5 Graph Partitioning by Random Walks 

Visual segmentation of networks based on 3D representations of 
their dual graphs is not always feasible. Furthermore, the result of 
such a segmentation may essentially depend on which eigenvectors 
have been chosen as the basis for the 3D representation. The com- 
putation of eigenvectors for large matrices can be time and resource 
consuming, and therefore it is important to have a good estimation 
on the minimal number of eigenvectors required for the proper graph 
segmentation. 

The graph partitioning problem seeks to partition a weighted
undirected graph $G$ into $n$ weakly connected components
$\Gamma_1, \ldots, \Gamma_n$ such that $\bigcup_{i=1}^{n} \Gamma_i \subseteq G$ and either
their properties share some common trait or the graph nodes belonging
to them are close to each other according to some distance measure
defined on the nodes of the graph. A number of different graph
partitioning strategies for undirected weighted graphs have been
studied in connection with Object Recognition and Learning in
Computer Vision [21].

In Sec. 3 we have shown that random walks introduced on
a connected undirected graph $G$ establish the $(N-1)$-dimensional
Euclidean space in which every pair of nodes, $i$ and $j$, appear at
some distance

$$ \mathrm{dist}_T(i, j) = \sqrt{K_{i,j}}, \tag{20} $$

where $K_{i,j}$ is the commute time (19) of random walks between $i$
and $j$. The random walks distance (20) can be used as a measure of
similarity between any two nodes in $G$. Namely, every node $i \in V$ of
an undirected graph $G$ may be represented by a vector
$z^{(i)} \in \mathbb{R}^{N-1}$ in the $(N-1)$-dimensional Euclidean space
associated to random walks,

$$ z^{(i)} = \left( \frac{\psi_{2,i}}{\sqrt{\pi_i (1 - \mu_2)}}, \ldots, \frac{\psi_{N,i}}{\sqrt{\pi_i (1 - \mu_N)}} \right). \tag{21} $$

We then assign each vector (21) to one of $n$ clusters $\Gamma_l$ whose
center (centroid),

$$ m^{(l)} = \frac{1}{|\Gamma_l|} \sum_{i \in \Gamma_l} z^{(i)}, \tag{22} $$

is the nearest to $z^{(i)}$ with respect to the distance (20). The objective
we try to achieve is to minimize the total intra-cluster variance of
the resulting partition $P$ of the graph $G$ into $n$ clusters, the squared
error function (s.e.f.),

$$ \mathrm{sef}(P) = \sum_{l=1}^{n} \sum_{i \in \Gamma_l} \left| z^{(i)} - m^{(l)} \right|^2. \tag{23} $$
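The assignment of nodes to the nearest centroid and the evaluation of the squared error function can be sketched with plain Lloyd iterations. This is only an illustration on a hypothetical toy network (two 4-cliques joined by one edge) with hand-picked initial centers; the function names are not from the text, and no claim is made that this is the procedure used for the Venetian data.

```python
import numpy as np

def random_walk_embedding(A):
    """Rows are the node vectors z^(i) in the random-walk space."""
    k = A.sum(axis=1)
    pi = k / k.sum()
    mu, psi = np.linalg.eigh(A / np.sqrt(np.outer(k, k)))
    mu, psi = mu[::-1], psi[:, ::-1]          # mu_1 = 1 comes first
    return psi[:, 1:] / np.sqrt(np.outer(pi, 1.0 - mu[1:]))

def kmeans_sef(Z, init_nodes, n_iter=20):
    """Lloyd iterations; returns cluster labels and the s.e.f."""
    centers = Z[list(init_nodes)].copy()
    for _ in range(n_iter):
        d2 = ((Z[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)            # nearest centroid
        for l in range(len(centers)):
            if np.any(labels == l):
                centers[l] = Z[labels == l].mean(axis=0)   # centroid update
    sef = sum(((Z[labels == l] - centers[l]) ** 2).sum()
              for l in range(len(centers)))
    return labels, sef

# Hypothetical toy network: two 4-cliques joined by the edge 3-4.
A = np.zeros((8, 8))
A[:4, :4] = 1
A[4:, 4:] = 1
np.fill_diagonal(A, 0)
A[3, 4] = A[4, 3] = 1

Z = random_walk_embedding(A)
labels, sef = kmeans_sef(Z, init_nodes=(0, 7))
```

In this example the two cliques are recovered as the two clusters, because commute times inside a clique are much shorter than those across the connecting edge.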

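Let $Z = [z^{(1)}, \ldots, z^{(N)}]$ be the $(N-1) \times N$ matrix whose
columns are the coordinates the nodes acquire in the Euclidean space
associated to random walks, let $Z_l$ be its submatrix formed by the
columns $z^{(i)}$, $i \in \Gamma_l$, and let $e = (1, 1, \ldots, 1)^{\top}$ be the
$|\Gamma_l|$-dimensional unity vector. Then it is clear that

$$ \mathrm{sef}(P) = \sum_{l=1}^{n} \left\| Z_l - m^{(l)} e^{\top} \right\|^2 = \sum_{l=1}^{n} \left\| Z_l P_l \right\|^2, \tag{24} $$

where $P_l = \mathbf{1}_{|\Gamma_l|} - e e^{\top} / |\Gamma_l|$ is the projection operator
of nodes onto the cluster $\Gamma_l$. Since $P_l^2 = P_l$, we immediately
obtain that

$$ \mathrm{sef}(P) = \sum_{l=1}^{n} \mathrm{tr}\left( Z_l P_l Z_l^{\top} \right) = \mathrm{tr}\left( Z^{\top} Z \right) - \mathrm{tr}\left( X^{\top} Z^{\top} Z X \right), \tag{25} $$

in which

$$ X = \left( \frac{x_1}{\sqrt{|\Gamma_1|}}, \ldots, \frac{x_n}{\sqrt{|\Gamma_n|}} \right) \tag{26} $$

is the rectangular orthogonal $N \times n$ matrix ($X^{\top} X = \mathbf{1}_n$) of the
normalized indicator vectors

$$ x_l(i) = \begin{cases} 1, & i \in \Gamma_l, \\ 0, & i \notin \Gamma_l. \end{cases} \tag{27} $$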
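The trace identity can be verified numerically. The sketch below uses random coordinates and a hypothetical partition of 8 points into 3 clusters; it takes the columns of `Z` as the points, which is an assumption about the layout made for this example.

```python
import numpy as np

rng = np.random.default_rng(0)
N, n = 8, 3
Z = rng.normal(size=(N - 1, N))                # columns play the role of z^(i)
labels = np.array([0, 0, 0, 1, 1, 2, 2, 2])    # hypothetical partition

# Squared error function computed cluster by cluster.
sef_direct = 0.0
for l in range(n):
    Zl = Z[:, labels == l]
    m = Zl.mean(axis=1, keepdims=True)         # centroid of cluster l
    sef_direct += ((Zl - m) ** 2).sum()

# Matrix of normalized indicator vectors, shape N x n.
X = np.zeros((N, n))
for l in range(n):
    members = labels == l
    X[members, l] = 1.0 / np.sqrt(members.sum())

# Trace form of the same quantity.
sef_trace = np.trace(Z.T @ Z) - np.trace(X.T @ Z.T @ Z @ X)
```

Both expressions agree up to rounding, and the columns of `X` are orthonormal because the clusters are disjoint.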

Considering elements of the $Z$ matrix as measuring similarity
between nodes, we can show following [22] that the Euclidean distance
(20) leads to Euclidean inner-product similarity, which can be
replaced by a general Mercer kernel [22]-[25] uniquely represented by
a positive semi-definite matrix $K_{ij}$.

If we then relax the discrete structure of $X$ by assuming that $X$ is
an arbitrary orthonormal matrix, the minimization of the objective
function $\mathrm{sef}(P)$ is reduced to the trace maximization problem,

$$ \max_{X^{\top} X = \mathbf{1}_n} \mathrm{tr}\left( X^{\top} Z^{\top} Z X \right). \tag{28} $$

A standard result in linear algebra (proven by K. Fan in 1949)
provides a global solution to the trace optimization problem: Given
a symmetric matrix $S$ with eigenvalues
$\lambda_1 \geq \ldots \geq \lambda_n \geq \ldots \geq \lambda_N$
and the matrix of corresponding eigenvectors, $[u_1, \ldots, u_N]$, the
maximum of $\mathrm{tr}(Q^{\top} S Q)$ over all $n$-dimensional orthonormal
matrices $Q$ such that $Q^{\top} Q = \mathbf{1}_n$ is given by

$$ \max_{Q^{\top} Q = \mathbf{1}_n} \mathrm{tr}\left( Q^{\top} S Q \right) = \sum_{k=1}^{n} \lambda_k, \tag{29} $$

and the optimal $n$-dimensional orthonormal matrix is

$$ Q = [u_1, \ldots, u_n]\, R, \tag{30} $$

where $R$ is an arbitrary orthogonal $n \times n$ matrix (describing a
rotation transformation in $\mathbb{R}^n$).
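The statement can be illustrated numerically on an arbitrary symmetric matrix. The sketch below is not a proof; it only compares the trace attained by the leading eigenvectors with the traces attained by randomly drawn orthonormal matrices, and checks that a rotation inside the optimal subspace does not change the value.

```python
import numpy as np

rng = np.random.default_rng(1)
N, n = 6, 2
B = rng.normal(size=(N, N))
S = (B + B.T) / 2                        # arbitrary symmetric test matrix

lam, U = np.linalg.eigh(S)               # ascending order
lam, U = lam[::-1], U[:, ::-1]           # lambda_1 >= ... >= lambda_N
Q_opt = U[:, :n]                         # leading eigenvectors, R = identity
best = np.trace(Q_opt.T @ S @ Q_opt)     # equals the sum of the n largest

# Other orthonormal Q (from QR of random matrices) do no better.
trials = []
for _ in range(200):
    Q, _ = np.linalg.qr(rng.normal(size=(N, n)))
    trials.append(np.trace(Q.T @ S @ Q))

# A rotation R inside the optimal subspace leaves the trace unchanged.
theta = 0.7
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
rotated = np.trace((Q_opt @ R).T @ S @ (Q_opt @ R))
```

The invariance under `R` is exactly the freedom that makes it hard to read discrete cluster indicators directly off the eigenvectors.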

The result (29)-(30) relates the problem of network segmentation
to the investigation of the $n$ primary eigenvectors of a symmetric matrix
defined on the graph nodes [20], [27]. The eigenvectors $u_{k>1}$ have
both positive and negative entries, so that in general the matrix
$[u_1, \ldots, u_n]$ differs substantially from the one composed of the
discrete cluster indicator vectors, which have only non-negative entries.

It is important to note that even for not very large $n$ it may
be rather difficult to compute the appropriate $n \times n$ orthonormal
transformation matrix $R$ which recovers the necessary discrete
cluster indicator structures. Furthermore, it can be shown that the
postprocessing of eigenvectors into the cluster indicator vectors can
be reduced to an optimization problem with $n(n-1)/2 - 1$
parameters [28]. Several methods have been proposed to obtain the
partitions from the eigenvectors of various similarity matrices (see
[29], [30] for a review). In the next section, we use the ideas of
Principal Component Analysis (PCA) in order to bypass the orthonormal
transformation.

6 Principal Component Analysis of Venetian Canals 

In statistics, Principal Component Analysis (PCA) is used for
reducing the size of a data set. It is achieved by the optimal
linear transformation retaining the subspace that has the largest
variance (the lower-order principal components) and ignoring
higher-order ones [31], [32].

Given an operator $S$ self-adjoint with respect to the measure
$m$ defined on a connected undirected graph $G$, it is well known
that the eigenvectors of the symmetric matrix $S$ form an ordered
orthonormal basis $\{\phi_k\}$ with real eigenvalues
$\mu_1 \geq \ldots \geq \mu_N$. The
ordered orthogonal basis represents the directions of the variances
of variables described by $S$.

If we consider the Laplace operator, $L = 1 - T$, defined on $G$,
its eigenvalues can be interpreted as the inverse characteristic time
scales of the diffusion process, such that the smallest eigenvalues
correspond to the stationary distribution together with the slowest
diffusion modes involving the most significant amounts of flowing
commodity. Therefore, while describing a network by means of the
Laplace operator, we must arrange the eigenvalues in increasing
order, $\lambda_1 < \ldots < \lambda_n < \ldots < \lambda_N$, and examine the ordered
orthogonal basis of eigenvectors, $[f_1, \ldots, f_N]$.

The number of components which may be detected in a network
with regard to a certain dynamical process defined on it depends
upon the number of essential eigenvectors of the relevant self-adjoint
operator. There is a simple time scale argument which we use in
order to determine the number of applicable eigenvectors.

It is obvious that while observing the network close to an
equilibrium state during a short time, we detect flows resulting from a
large number of transient processes evolving toward the stationary
distribution and characterized by the relaxation times
$\propto \lambda_k^{-1}$. While measuring the flows over a sufficiently long
time $\tau$, we may discover just $n$ different eigenmodes, such that

$$ \lambda_1 < \ldots < \lambda_n < \tau^{-1} < \ldots < \lambda_N. \tag{31} $$

In general, the longer the time of measurements $\tau$, the smaller the
number of eigenvectors we have to take into account in the component
analysis of the network. If the time of measurements
is fixed, we can determine the number of required eigenvectors.
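The time scale argument amounts to counting the eigenvalues of the Laplace operator that lie below $\tau^{-1}$. A minimal sketch follows, using the Laplace operator $L = 1 - \hat{T}$ built from the symmetric transition operator and a hypothetical toy network of two 4-cliques joined by one edge; the function name is illustrative.

```python
import numpy as np

def modes_visible_after(A, tau):
    """Number n of eigenvalues of L = 1 - T_hat with lambda_k < 1 / tau."""
    k = A.sum(axis=1)
    L = np.eye(len(A)) - A / np.sqrt(np.outer(k, k))
    lam = np.linalg.eigvalsh(L)               # ascending order
    return int(np.sum(lam < 1.0 / tau)), lam

# Hypothetical toy network: two 4-cliques joined by the edge 3-4.
A = np.zeros((8, 8))
A[:4, :4] = 1
A[4:, 4:] = 1
np.fill_diagonal(A, 0)
A[3, 4] = A[4, 3] = 1

n_long, lam = modes_visible_after(A, tau=5.0)    # long observation time
n_short, _ = modes_visible_after(A, tau=0.5)     # short observation time
```

For the long observation time only the stationary mode and the slow mode that separates the two cliques remain, while for the short one all eight modes are counted.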

In what follows, we consider the symmetric ("normalized")
Laplace operator [19],

$$ L_{ij} = \delta_{ij} - \hat{T}_{ij}, \tag{32} $$

where $\hat{T}_{ij}$ is the symmetric Markov transition operator (9).



6.1 Low dimensional representations of transport networks 
by the principal directions 

In order to obtain the best quality segmentation, it is convenient
to center the $n$ primary eigenvectors. The centroid vector
(representing the center of mass of the set $[f_1, \ldots, f_n]$) is
calculated as the arithmetic mean,

$$ m = \frac{1}{n} \sum_{k=1}^{n} f_k. \tag{33} $$

Let us denote the $N \times n$ matrix whose columns are the $n$ centered
eigenvectors by

$$ F = [\, f_1 - m, \ldots, f_n - m \,]. $$

Then, the symmetric matrix of covariances between the entries of
eigenvectors $\{f_k\}$ is the product of $F$ and its adjoint $F^{\top}$,

$$ \mathrm{Cov} = \frac{F F^{\top}}{N - 1}. \tag{34} $$

Figure 4: The correlation matrix calculated for the dual graph representation of
Venetian canals for the system of the first 7 eigenvectors of the normalized
Laplace operator (32). The entries of the matrix are ranked from 1 (red) to -1 (blue).

It is important to note that the corresponding Gram matrix
$\mathbf{F}^{\top}\mathbf{F}/(N-1) = 1$ due to the orthogonality of
the basis eigenvectors. The main contributions to the symmetric
matrix $\mathrm{Cov}$ are related to the groups of nodes

$$\mathrm{Cov}\,\mathbf{u}_k = \sigma_k \mathbf{u}_k, \tag{35}$$

which can be identified by means of the eigenvectors
$\{\mathbf{u}_k\}$ associated with the largest eigenvalues among
$\sigma_1 \geq \sigma_2 \geq \ldots \geq \sigma_N$.
By ordering the eigenvectors by decreasing eigenvalue (largest first),
we create an ordered orthogonal basis in which the first eigenvector
has the direction of the largest variance of the components of the $n$
eigenvectors $\{\mathbf{f}_k\}$. Let us note that, due to the structure
of $\mathbf{F}$, only the first $n-1$ eigenvalues $\sigma_k$ are
non-trivial. In accordance with the standard PCA notation, the
eigenvectors of the covariance matrix are called the principal
directions of the network with respect to the diffusion process
defined by the operator $S$. A low-dimensional representation of
the network is given by its principal directions
$[\mathbf{u}_1, \ldots, \mathbf{u}_{n-1}]$, for $n < N$.
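The construction (33)-(35) can be sketched in Python as follows. The sketch assumes NumPy and a hypothetical function name `principal_directions`; the eigenvectors are stored as the columns of an array of shape $(N, n)$, a layout chosen here so that the covariance matrix is indexed by nodes, and the toy graph (two triangles joined by an edge) stands in for the canal network.

```python
import numpy as np

def principal_directions(eigvecs):
    """eigvecs: (N, n) array whose columns are the n primary eigenvectors f_k.

    Returns the N x N covariance matrix (34), its n-1 leading eigenvalues in
    decreasing order, and the matching eigenvectors u_k as the rows of U.
    """
    f = np.asarray(eigvecs, dtype=float)
    N, n = f.shape
    m = f.mean(axis=1, keepdims=True)          # centroid (33), shape (N, 1)
    F = f - m                                  # centered eigenvectors (columns)
    cov = F @ F.T / (N - 1)                    # covariance matrix (34)
    sigma, vecs = np.linalg.eigh(cov)          # eigenvalues in increasing order
    order = np.argsort(sigma)[::-1][: n - 1]   # keep the n-1 largest ones
    return cov, sigma[order], vecs[:, order].T

# Hypothetical toy graph: two triangles joined by one edge (6 nodes).
A = np.array([[0, 1, 1, 0, 0, 0],
              [1, 0, 1, 0, 0, 0],
              [1, 1, 0, 1, 0, 0],
              [0, 0, 1, 0, 1, 1],
              [0, 0, 0, 1, 0, 1],
              [0, 0, 0, 1, 1, 0]], dtype=float)
d = A.sum(axis=1)
L = np.eye(6) - A / np.sqrt(np.outer(d, d))    # assumed form of the operator (32)
_, lap_vecs = np.linalg.eigh(L)
cov, sigma, U = principal_directions(lap_vecs[:, :3])   # n = 3 eigenvectors
```

With $n = 3$ eigenvectors the function returns $n - 1 = 2$ principal directions, stored as the rows of `U`.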

Diagonal elements of the matrix $\mathrm{Cov}$ quantify the component
variances of the eigenvectors $[\mathbf{f}_1, \ldots, \mathbf{f}_n]$
around their mean values (33) and may be large, especially for large
networks. Therefore, it is practical for us to use the standardized
correlation matrix,

$$\mathrm{Corr}_{ij} = \frac{\mathrm{Cov}_{ij}}{\sqrt{\mathrm{Cov}_{ii}}\,\sqrt{\mathrm{Cov}_{jj}}}, \tag{36}$$

instead of the covariance matrix $\mathrm{Cov}$. It is important to note
that the diagonal elements of (36) equal 1, while the off-diagonal elements
are the Pearson coefficients of linear correlation.
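The standardization of the covariance matrix can be sketched in a few lines of Python. The function name `correlation_from_covariance` and the small covariance matrix are hypothetical and serve only as an illustration; the sketch assumes NumPy and strictly positive diagonal entries.

```python
import numpy as np

def correlation_from_covariance(cov):
    """Corr_ij = Cov_ij / (sqrt(Cov_ii) * sqrt(Cov_jj)), assuming Cov_ii > 0."""
    cov = np.asarray(cov, dtype=float)
    std = np.sqrt(np.diag(cov))
    return cov / np.outer(std, std)

# Hypothetical 3 x 3 covariance matrix, for illustration only.
cov_toy = np.array([[4.0, 2.0, -1.0],
                    [2.0, 9.0, 0.0],
                    [-1.0, 0.0, 1.0]])
corr_toy = correlation_from_covariance(cov_toy)
```

The diagonal of `corr_toy` consists of ones, and every off-diagonal entry lies between -1 and 1.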



Figure 5: The coarse-grained connectivity matrix derived from the low-dimensional
representation of Venetian canals given by the transition matrix
$\mathbf{U}_6^{\top}\mathbf{U}_6$. Both axes are labeled by canal IDs.

The correlation matrix (36) calculated with regard to the first $n$
eigenvectors possesses a complicated structure containing multiple
overlapping blocks pertinent to a low-dimensional representation
of the network of Venetian canals, which allows for a further
simplification. In Fig. 4, we have presented the correlation matrix (36)
computed for the first 7 eigenvectors of the normalized Laplace
operator (32) defined on the dual graph representation of 96 Venetian
canals.

Let $\mathbf{U}$ be the orthonormal matrix which contains the eigenvectors
$\{\mathbf{u}_k\}$, $k = 1, \ldots, n-1$, of the covariance (or correlation)
matrix as the row vectors. These vectors form the orthogonal basis of the
$(n-1)$-dimensional vector space, in which every variance
$(\mathbf{f}_k - \mathbf{m})$ is represented by a point
$\mathbf{g}_k \in \mathbb{R}^{n-1}$,

$$\mathbf{g}_k = \mathbf{U}\,(\mathbf{f}_k - \mathbf{m}). \tag{37}$$

Then each original eigenvector can be decoded from
$\mathbf{g}_k \in \mathbb{R}^{n-1}$ by the inverse transformation,

$$\mathbf{f}_k = \mathbf{U}^{\top}\mathbf{g}_k + \mathbf{m}. \tag{38}$$

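The use of transformations (37) and (38) allows us to obtain the
$(n-1)$-dimensional representation $\{\boldsymbol{\varphi}_k\}$ of the
$N$-dimensional basis vectors $\{\mathbf{f}_k\}$ in the form

$$\boldsymbol{\varphi}_k = \mathbf{U}^{\top}\mathbf{U}\,\mathbf{f}_k + (1 - \mathbf{U}^{\top}\mathbf{U})\,\mathbf{m}, \tag{39}$$

that minimizes the mean-square error between
$\mathbf{f}_k \in \mathbb{R}^{N}$ and
$\boldsymbol{\varphi}_k \in \mathbb{R}^{N}$ for given $n$.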
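The relation between the encoding, the decoding, and the low-dimensional representation can be checked numerically. In the sketch below (NumPy assumed) the function names `encode`, `decode` and `low_dim_representation` are hypothetical, and the matrix with orthonormal rows is generated at random instead of being computed from a covariance matrix; the point is only that decoding an encoded vector gives $\mathbf{U}^{\top}\mathbf{U}\,\mathbf{f} + (1 - \mathbf{U}^{\top}\mathbf{U})\,\mathbf{m}$.

```python
import numpy as np

def encode(U, f, m):
    """Project the centered vector onto the principal directions: g = U (f - m)."""
    return U @ (f - m)

def decode(U, g, m):
    """Map a low-dimensional point back to N dimensions: U^T g + m."""
    return U.T @ g + m

def low_dim_representation(U, f, m):
    """Closed form: phi = U^T U f + (1 - U^T U) m."""
    P = U.T @ U
    return P @ f + (np.eye(P.shape[0]) - P) @ m

# Hypothetical data: a random matrix with orthonormal rows and random vectors.
rng = np.random.default_rng(0)
N, r = 6, 2
Q, _ = np.linalg.qr(rng.standard_normal((N, r)))   # Q has orthonormal columns
U = Q.T                                            # principal directions as rows
f = rng.standard_normal(N)
m = rng.standard_normal(N)
phi = low_dim_representation(U, f, m)
round_trip = decode(U, encode(U, f, m), m)
```

Here `phi` and `round_trip` coincide up to rounding errors; the reconstruction is exact only when $\mathbf{f} - \mathbf{m}$ lies in the span of the rows of $\mathbf{U}$.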

Variances of eigenvectors $\{\mathbf{f}_k\}$ are positively correlated
within a principal component of the transport network. Thus, the
transition matrix $\mathbf{U}^{\top}\mathbf{U}$ can be interpreted as the
connectivity patterns acquired by the network with respect to the
diffusion process. Two nodes, $i$ and $j$, belong to one and the same
principal component of the network if
$(\mathbf{U}^{\top}\mathbf{U})_{ij} > 0$. By applying the Heaviside
function, which is zero for negative argument and one for positive
argument, to the elements of the transition matrix
$\mathbf{U}^{\top}\mathbf{U}$, we derive the coarse-grained connectivity
matrix of network components. In Fig. 5, we have shown the coarse-grained
connectivity matrix obtained from the transition matrix
$\mathbf{U}_6^{\top}\mathbf{U}_6$ for the dual graph representation of
Venetian canals.
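The coarse-graining step can be sketched as follows. The function name `coarse_grained_connectivity` and the two toy principal directions are hypothetical; the sketch assumes NumPy and sets the step function to zero at zero argument, a choice the text leaves open.

```python
import numpy as np

def coarse_grained_connectivity(U):
    """Apply the Heaviside step to U^T U: 1 for positive entries, 0 otherwise."""
    transition = U.T @ U          # N x N matrix built from the rows u_k of U
    return (transition > 0).astype(int)

# Hypothetical principal directions for 4 nodes (rows are orthonormal).
s = 1.0 / np.sqrt(2.0)
U_toy = np.array([[s, s, 0.0, 0.0],
                  [0.0, 0.0, s, -s]])
C_toy = coarse_grained_connectivity(U_toy)
```

In this toy case nodes 0 and 1 fall into one component, while nodes 2 and 3, whose entries in the transition matrix are negatively related, remain separate.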

6.2 Dynamical segmentations of transport networks

In general, building low-dimensional representations for transport
networks with respect to a certain dynamical process defined on them
is a complicated procedure which cannot be reduced to (and reproduced
by) the naive introduction of "supernodes" by either merging several
nodes or shrinking complete subgraphs of the original graph. The
implementation of the spectral approach removes indeterminacies of the
empirical clique concatenation techniques used in space syntax analysis
of urban textures, [M]. If the covariance matrix clearly exhibits a
block structure, and once the relevant coarse-grained connectivity
matrix is computed, we can identify dynamical clusters (blocks) by
using a linearized cluster assignment and compute the cluster crossing
and the cluster overlap along the specified ordering using the spectral
ordering algorithm [28]. The problem of dynamical segmentation of a
transport network in fast time scales is more computationally complex,
especially for large networks, because many eigenvectors, if not all,
have to be taken into account while calculating the covariance matrix.
It is important to note that the covariance matrix in this case takes
the form of a sparse, nearly diagonal matrix (see Fig. 6).

Figure 6: The covariance matrix calculated for the dual graph representation of
Venetian canals with regard to the first 80 eigenvectors of the normalized Laplace
operator (32). The entries of the covariance matrix are ranked from the largest positive
values (red) to the largest negative values (blue). Both axes are labeled by canal IDs.

Sparsity of the deduced coarse-grained connectivity matrix (which
is shown in Fig. 7) in fast time scales entails loosely coupled
systems that lack any form of large-scale structure. A sparse
coarse-grained connectivity matrix may be useful when storing and
manipulating data for approximate descriptions of transport networks
in fast time scales.

Low-dimensional representations of not very large transport networks
given by the coarse-grained connectivity matrices can be represented
by a 3D graph. In Fig. 8, we have shown the 3D image of a dynamical
segmentation of Venetian canals. The dual graph representation of the
Venetian canal network has been analyzed, and a ball has been assigned
to each canal. The radius of the $i$-th ball is taken equal to the
norm, introduced earlier, that the node $i$ has in the
$(N-1)$-dimensional Euclidean space associated to random walks
introduced on the connected, undirected dual graph of Venetian canals,

$$r_i = \|i\|_T. \tag{40}$$

Figure 7: The fast time scale coarse-grained connectivity matrix for the
low-dimensional representation of Venetian canals deduced from the transition matrix
$\mathbf{U}_{79}^{\top}\mathbf{U}_{79}$.

Those nodes characterized by the worst accessibility levels have
the largest norms with respect to random walks and therefore are
represented by balls of the largest radii.

The coordinates of each ball have been given by the relevant
components of the first three eigenvectors of the coarse-grained
connectivity matrix displayed in Fig. 5. These eigenvectors determine
the directions of the largest variances of correlations delineating the
low-dimensional representation of the network. The key observation
is that canals with short access times are also characterized by small
variances of correlations and are therefore gathered close to the
center of the figure displayed in Fig. 8, no matter which city district
they belong to. On the contrary, the worst accessible canals are
distinguished by the strongest correlation variances and are located
on the fringes of the figure, far from its center. At the same time,
the radii of the balls representing them are the largest among all
balls, since they acquire the largest norms with respect to random
walks. It is remarkable that they can be perfectly identified with the
traditional historical sestieri of Venice.

Figure 8: The 3D image of a dynamical segmentation of Venetian canals built for the
first 8 eigenvalues of the normalized Laplace operator (32). The structural differences
between the historical city districts of Venice are clearly visible.
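The way in which the ball coordinates are obtained can be sketched in Python. The sketch assumes NumPy, a hypothetical function name `embedding_3d`, and the reading that the "first three" eigenvectors are those belonging to the three largest eigenvalues of the symmetric coarse-grained connectivity matrix; the toy matrix with two blocks is invented for illustration, and the radii are omitted because the norm is not defined in this section.

```python
import numpy as np

def embedding_3d(connectivity):
    """Return an (N, 3) array; row i holds the assumed coordinates of node i."""
    C = np.asarray(connectivity, dtype=float)
    eigenvalues, eigenvectors = np.linalg.eigh(C)   # C is assumed symmetric
    top3 = np.argsort(eigenvalues)[::-1][:3]        # three largest eigenvalues
    return eigenvectors[:, top3]

# Hypothetical coarse-grained connectivity matrix with two blocks (5 nodes).
C_blocks = np.array([[1, 1, 0, 0, 0],
                     [1, 1, 0, 0, 0],
                     [0, 0, 1, 1, 1],
                     [0, 0, 1, 1, 1],
                     [0, 0, 1, 1, 1]])
coords = embedding_3d(C_blocks)
```

In this toy example the nodes of one block share their first two coordinates, so each block collapses onto its own direction in the picture.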

7 Discussion and Conclusion 

The impact of urban landscapes on the construction of social
relations draws attention in the fields of ethnography, sociology, and
anthropology. In particular, it has been suggested that the urban
space combining social, economic, ideological and technological
factors is responsible for the technological, socioeconomic, and
cultural development, [SS]. It is worth mentioning that the processes
relating urbanization to economic development and knowledge production
are very general, being shared by all cities belonging to the same
urban system and sustained across different nations and times [36].
There is a tight connection between the physical activity of humans,
their mobility, and the layout of buildings, roads, and other
structures that physically define a community [37]. The spatial
organization of a place has an extremely important effect on the way
people move through spaces and meet other people by chance [M]. The
patterns of social movement and economic development over a thousand
years of Venetian history have been imprinted in the space syntax of
the city.

In the present paper, we have investigated the canal network in
Venice by means of random walks. Random walks defined on an
undirected graph of $N$ nodes establish the $(N-1)$-dimensional
Euclidean space in which distances and angles acquire a clear
statistical interpretation. The properties of nodes with respect to
random walks allow partitioning the city canal network into disjoint
divisions which may be identified with the traditional divisions of
Venice (sestieri).

We have developed a general approach to the coarse-graining of
transport networks based on the PCA method for the low-dimensional
representation of large data sets. We believe that the proposed
technique can be useful in many potential applications, such as
establishing rigorous mathematical foundations for the analysis of
urban textures and establishing the urbanization road to a
harmonious city.